On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey
Zhang, Meishan, Zhang, Xin, Zhao, Xinping, Huang, Shouzheng, Hu, Baotian, Zhang, Min
Text embeddings have attracted growing interest due to their effectiveness across a wide range of natural language processing (NLP) tasks, including retrieval, classification, clustering, bitext mining, and summarization. With the emergence of pretrained language models (PLMs), general-purpose text embeddings (GPTE) have gained significant traction for their ability to produce rich, transferable representations. The general architecture of GPTE typically leverages PLMs to derive dense text representations, which are then optimized through contrastive learning on large-scale pairwise datasets. In this survey, we provide a comprehensive overview of GPTE in the era of PLMs, focusing on the roles PLMs play in driving its development. We first examine the fundamental architecture and describe the basic roles of PLMs in GPTE, i.e., embedding extraction, expressivity enhancement, training strategies, learning objectives, and data construction. We then describe advanced roles enabled by PLMs, including multilingual support, multimodal integration, code understanding, and scenario-specific adaptation. Finally, we highlight potential future research directions that move beyond traditional improvement goals, including ranking integration, safety considerations, bias mitigation, structural information incorporation, and the cognitive extension of embeddings. This survey aims to serve as a valuable reference for both newcomers and established researchers seeking to understand the current state and future potential of GPTE.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- (17 more...)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.45)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.45)
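The abstract above describes the standard GPTE recipe: a PLM maps each text to a dense vector, and the vectors are optimized with contrastive learning on paired data. A minimal sketch of that in-batch contrastive (InfoNCE-style) objective is below; the embedding inputs here are random stand-ins rather than PLM outputs, and the temperature value is illustrative.

```python
# Sketch of in-batch contrastive (InfoNCE) training for text embeddings:
# each query embedding is pulled toward its paired positive and pushed
# away from the other positives in the batch. Embeddings are random
# stand-ins here; in GPTE they would come from a PLM encoder.
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce_loss(queries, positives, temperature=0.05):
    """Row i of `queries` is paired with row i of `positives`."""
    q = normalize(queries)
    p = normalize(positives)
    logits = q @ p.T / temperature                   # (B, B) scaled cosine sims
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # cross-entropy on matched pairs

rng = np.random.default_rng(0)
B, D = 8, 32
q = rng.normal(size=(B, D))
loss_random = info_nce_loss(q, rng.normal(size=(B, D)))        # unrelated pairs
loss_aligned = info_nce_loss(q, q + 0.01 * rng.normal(size=(B, D)))  # near-identical pairs
```

As expected, near-identical pairs yield a far lower loss than unrelated pairs, which is the gradient signal that shapes the embedding space during training.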
SAGE: An Agentic Explainer Framework for Interpreting SAE Features in Language Models
Han, Jiaojiao, Xu, Wujiang, Jin, Mingyu, Du, Mengnan
Large language models (LLMs) have achieved remarkable progress, yet their internal mechanisms remain largely opaque, posing a significant challenge to their safe and reliable deployment. Sparse autoencoders (SAEs) have emerged as a promising tool for decomposing LLM representations into more interpretable features, but explaining the features captured by SAEs remains a challenging task. In this work, we propose SAGE (SAE AGentic Explainer), an agent-based framework that recasts feature interpretation from a passive, single-pass generation task into an active, explanation-driven process. SAGE implements a rigorous methodology by systematically formulating multiple explanations for each feature, designing targeted experiments to test them, and iteratively refining explanations based on empirical activation feedback. Experiments on features from SAEs of diverse language models demonstrate that SAGE produces explanations with significantly higher generative and predictive accuracy compared to state-of-the-art baselines.
- Europe > Austria > Vienna (0.14)
- Asia > China (0.05)
- North America > United States > New Jersey (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (2 more...)
- North America > United States > California > Los Angeles County > Los Angeles (0.27)
- North America > United States > New York > New York County > New York City (0.14)
- Asia > Middle East > Jordan (0.04)
- (4 more...)
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (3 more...)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
- Research Report (0.68)
- Instructional Material (0.46)
- Health & Medicine (0.68)
- Transportation > Infrastructure & Services (0.50)
- Transportation > Ground > Road (0.50)
- Europe > Germany > Berlin (0.15)
- Europe > Germany > Schleswig-Holstein (0.07)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.06)
- (26 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Asia > China > Beijing > Beijing (0.04)